BLOCK MOVE ENGINE WITH GAMMA AND COLOR CONVERSIONS

Cross Reference to Related Applications 

The present application may relate to co-pending application
Serial No. 09/ , filed concurrently (Attorney Docket 1496.00119)
and Serial No. 09/ , filed concurrently (Attorney Docket
1496.00154), which are each hereby incorporated by reference in
their entirety.

Field of the Invention 

The present invention relates to a method and/or 
architecture for implementing block modify and move engines (BMMEs) 
generally and, more particularly, a method and/or architecture for 
implementing color and gamma correctors that may be used within the 
data modification section of a BMME. 

Background of the Invention 

A block move engine (BME) (a bit blitter or blitting engine) for
rapidly copying blocks of graphics data from one location in
memory to another is generally used for graphics processing. BMEs
may be extended to include two or more input data streams of
identical size which are combined by a logical composition
operation and written back to memory as a single data block. The
demand for improvements in graphics speed and resolution and the
convergence of video and graphics applications onto common
platforms have made it desirable to incorporate a wider selection
of functions within the general structure of a BME.



Summary of the Invention

The present invention concerns an apparatus comprising a first
circuit and a second circuit. The first circuit may be configured
to present a first portion of an output data stream in response to
a first portion of an input data stream. The second circuit may be
configured to present a second portion of the output data stream
in response to a second portion of the input data stream. The
apparatus may be configured to perform color and gamma correction
on the input data stream to generate the output data stream in
response to one or more control signals. In one example, the
apparatus may comprise a block move engine (BME).

The objects, features and advantages of the present invention
include providing a method and/or architecture for implementing
color and gamma correctors that may be suitable for inclusion
within the data modification section of a BMME that may implement
color and gamma correction.

Brief Description of the Drawings 

These and other objects, features and advantages of the 
present invention will be apparent from the following detailed 
description and the appended claims and drawings in which: 

FIG. 1 is a block diagram of a preferred embodiment of 
the present invention; 

FIG. 2 illustrates a context of the present invention; 

FIG. 3 is a detailed block diagram of the circuit of FIG. 1;

FIG. 4 is a detailed block diagram of the color corrector 
circuit of FIG. 3; 

FIG. 5 is a detailed block diagram of an example of the 
color corrector arithmetic circuit of FIG. 4; 

FIG. 6 is a detailed block diagram of the gamma corrector
circuit of FIG. 3; and

FIG. 7 is a graph illustrating an operation of the gamma table.

Detailed Description of the Preferred Embodiments 

Referring to FIG. 1, a block diagram of a circuit (or system) 100
is shown in accordance with a preferred embodiment of the present
invention. The circuit 100 may provide color and gamma correction
within the data modification section of a block modify and move
engine (BMME).

The circuit 100 generally comprises a delay block (or circuit) 102
and a correction block (or circuit) 104. The circuit 100 may have
an input 106 that may receive a signal (e.g., FRONTIN), an input
108 that may receive one or more coefficient and/or offset signals
(e.g., COEFFS and OFFSETS), an input 110 that may receive a signal
(e.g., GAMEN) and an output 112 that may present a signal (e.g.,
FRONTOUT). A portion of the signal FRONTIN (e.g., ALPHA) may be
presented to an input 120 of the delay circuit 102, and another
portion of the signal FRONTIN (e.g., CC) may be presented to an
input 122 of the correction circuit 104. The alpha data ALPHA may
be data corresponding to a first color component. The color
component data CC may be data corresponding to a second color
component. The correction circuit 104 may also receive the
coefficient signals COEFFS, the offset signals OFFSETS and the
signal GAMEN. The delay circuit 102 may present a portion of the
signal FRONTOUT and the correction circuit 104 may present another
portion of the signal FRONTOUT. The various signals of the present
invention may be implemented as single-bit or multi-bit signals.

Color correction and gamma correction may both be valid operations
on color components of graphics and video data. However, such
correction may not be relevant to alpha data. A bypass path via
the delay 102 may be provided for the alpha data ALPHA, when
applicable. Delay for color components CC through the correction
circuitry 104 may be matched by the delay 102 for the alpha
channel ALPHA. The color components CC may be RGB or YUV for most
graphics and video operations. However, other appropriate color
components may be implemented to meet the design criteria of a
particular implementation.
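The matched-delay bypass described above can be sketched in software as
follows. This is only an illustrative model, not the circuit itself: the
function name `run_pipeline`, the `latency` parameter and the use of
queues to stand in for pipeline registers are assumptions made for the
example.

```python
from collections import deque

def run_pipeline(pixels, correct, latency=3):
    """Pass (alpha, cc) pixels through a correction stage that takes
    `latency` cycles, delaying alpha by the same number of cycles."""
    alpha_delay = deque([None] * latency)  # stands in for delay circuit 102
    cc_pipe = deque([None] * latency)      # stands in for latency of circuit 104
    out = []
    # Feed the pixels, then `latency` idle cycles to flush the pipeline.
    for px in list(pixels) + [None] * latency:
        alpha_in, cc_in = px if px is not None else (None, None)
        alpha_delay.append(alpha_in)
        cc_pipe.append(correct(cc_in) if cc_in is not None else None)
        alpha_out = alpha_delay.popleft()
        cc_out = cc_pipe.popleft()
        if alpha_out is not None:
            out.append((alpha_out, cc_out))  # alpha and CC leave together
    return out

# Hypothetical correction that doubles each color component.
result = run_pipeline([(255, (1, 2, 3)), (0, (4, 5, 6))],
                      lambda cc: tuple(2 * c for c in cc))
```

Because both queues have the same depth, each alpha value leaves on the
same cycle as the corrected color components of the same pixel.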

Referring to FIG. 2, a context of the present invention is shown.
The details of FIG. 2 are described in co-pending application
Serial No. 09/ , filed concurrently (Attorney Docket 1496.00154).

Referring to FIG. 3, a more detailed diagram of the circuit 100 is
shown. In particular, the correction circuit 104 generally
comprises a color corrector circuit 130, a gamma corrector circuit
132 and a multiplexer 134. The color corrector circuit 130 may
present a signal (e.g., CC) to the gamma corrector circuit 132 and
the multiplexer 134. The gamma corrector circuit 132 may also
present a signal (e.g., CC') to an input to the multiplexer 134.
The signal GAMEN is generally presented to a select input of the
multiplexer 134. The gamma corrector circuit 132 may be bypassed
via the multiplexer 134. The multiplexer 134 may select either the
processed data CC' from the gamma corrector 132 or the bypassed
data CC in response to the enable/control GAMEN. The multiplexer
134 may present a portion of the output 112 by selecting the
signal CC from the color corrector circuit 130 or the signal CC'
from the gamma corrector circuit 132. A bypass path for the color
corrector circuit 130 may not be necessary, since the correction
coefficients COEFFS and offsets OFFSETS may be set to make the
color corrector circuit 130 transparent.
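The selection performed by the multiplexer 134 can be modeled in a few
lines. This is a minimal sketch: the function name
`correction_circuit`, the callable arguments and the polarity of GAMEN
(a true value selecting the gamma-corrected data) are assumptions made
for illustration.

```python
def correction_circuit(cc_in, color_corrector, gamma_corrector, gamen):
    """Model the data path of the correction circuit 104."""
    cc = color_corrector(cc_in)        # output CC of color corrector 130
    cc_prime = gamma_corrector(cc)     # output CC' of gamma corrector 132
    return cc_prime if gamen else cc   # multiplexer 134, selected by GAMEN

# Hypothetical stand-in stages, used only to exercise the data path.
add_one = lambda cc: tuple(c + 1 for c in cc)
double = lambda cc: tuple(2 * c for c in cc)

with_gamma = correction_circuit((1, 2, 3), add_one, double, gamen=True)
bypassed = correction_circuit((1, 2, 3), add_one, double, gamen=False)
```

Note that the color corrector stage always runs; only the gamma stage
has a bypass, which matches the description above.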

The color corrector circuit 130 may implement the equation EQ1:

EQ1: Ro = Crr * (Ri + Rif) + Cgr * (Gi + Gif) + Cbr * (Bi + Bif) + Rof
     Go = Crg * (Ri + Rif) + Cgg * (Gi + Gif) + Cbg * (Bi + Bif) + Gof
     Bo = Crb * (Ri + Rif) + Cgb * (Gi + Gif) + Cbb * (Bi + Bif) + Bof

where the variables Ri, Gi, and Bi are typically the input color
components. The variables Ro, Go, and Bo typically correspond to
output components. The variables Rif, Gif, and Bif are typically
input offsets. The variables Rof, Gof, and Bof are typically
output offsets. The variable Crr may be a coefficient that
determines how much of a value (e.g., Ri+Rif) is added to make the
variable Ro. The variable Cgr may be a coefficient that determines
how much of a value (e.g., Gi+Gif) is added to make the variable
Ro. The variable Cbr may be a coefficient that determines how much
of a value (e.g., Bi+Bif) is added to make the variable Ro. In one
example, the coefficients and offsets may be signed values. For
simplicity, the RGB color format is used for illustrative
purposes. However, other color components (e.g., YUV), or a
mixture thereof, may be implemented to meet the design criteria of
a particular implementation.
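Based on the variable definitions above, the color correction can be
sketched as a 3x3 matrix operation with input and output offsets. The
function name `color_correct` and the nested-list layout of the
coefficients are assumptions made for illustration. The example also
shows how identity coefficients and zero offsets make the corrector
transparent.

```python
def color_correct(rgb_in, coeffs, in_offsets, out_offsets):
    """Apply input offsets, the 3x3 coefficients and output offsets.

    coeffs[x][y] weights input component x in output component y, so
    coeffs[0] holds (Crr, Crg, Crb), coeffs[1] holds (Cgr, Cgg, Cgb)
    and coeffs[2] holds (Cbr, Cbg, Cbb).
    """
    adjusted = [c + o for c, o in zip(rgb_in, in_offsets)]  # e.g., Ri + Rif
    return tuple(
        sum(coeffs[x][y] * adjusted[x] for x in range(3)) + out_offsets[y]
        for y in range(3)
    )

ZERO = (0, 0, 0)
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
SWAP_RG = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]  # R feeds Go, G feeds Ro

transparent = color_correct((10, 20, 30), IDENTITY, ZERO, ZERO)
swapped = color_correct((10, 20, 30), SWAP_RG, ZERO, ZERO)
offset = color_correct((10, 20, 30), IDENTITY, (1, 1, 1), (5, 5, 5))
```

The sketch omits limiting of the outputs and any fixed-point effects.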

The resulting output from the equations may be limited to prevent
illegal values being propagated to other parts of the circuit 100.
For example, extremely high (or positive) values may be clipped to
the maximum level and extremely low (or negative) values may be
clipped to the minimum.

With sufficient range and accuracy for the coefficients COEFFS and
the offsets OFFSETS, and for the intermediate calculation values,
corrections to brightness (e.g., offset) and contrast (e.g., gain)
on any or all color components may be achieved. Such a
configuration may also convert between the YUV and RGB color
spaces in either direction.
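As an illustration of such settings, the sketch below applies a
gain/offset adjustment and a color space conversion using the same
matrix operation. The helper `apply_matrix`, the gain and offset
values, and the JFIF-style full-range RGB to YCbCr coefficients (often
loosely called YUV) are assumptions chosen for the example; the text
above does not specify particular coefficient values.

```python
def apply_matrix(rgb_in, coeffs, in_offsets, out_offsets):
    """Offset the inputs, apply the 3x3 coefficients, add output offsets.
    coeffs[x][y] weights input component x in output component y."""
    adjusted = [c + o for c, o in zip(rgb_in, in_offsets)]
    return tuple(
        sum(coeffs[x][y] * adjusted[x] for x in range(3)) + out_offsets[y]
        for y in range(3)
    )

# Contrast (gain 1.5) and brightness (+16) applied equally to R, G and B.
GAIN = [[1.5, 0, 0], [0, 1.5, 0], [0, 0, 1.5]]
brighter = apply_matrix((100, 50, 0), GAIN, (0, 0, 0), (16, 16, 16))

# Assumed JFIF-style RGB -> YCbCr weights. Row x holds the weights of
# input x (R, G, B) in the outputs (Y, Cb, Cr).
RGB_TO_YCBCR = [
    [0.299, -0.168736, 0.5],
    [0.587, -0.331264, -0.418688],
    [0.114, 0.5, -0.081312],
]
white = apply_matrix((255, 255, 255), RGB_TO_YCBCR, (0, 0, 0), (0, 128, 128))
```

With these assumed coefficients, a white pixel maps to full luminance
with both chroma components at their mid-scale value.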

In one example, the implementation of the equations may be
achieved directly using nine multipliers and multiple adders. Such
a configuration may calculate the result quickly. However, a large
amount of circuitry may be used. In another example, the
implementation of the equations may be achieved using a single
multiplier and registers to hold intermediate results. However,
such a configuration may be slow and impractical, since the area
of the extra registers and multiplexers required may outweigh the
benefit of cutting down on multipliers. A preferred implementation
may implement three multipliers and may calculate the results over
three clock cycles (to be described in more detail in connection
with FIG. 4).

Referring to FIG. 4, a more detailed diagram of the color
corrector circuit 130 is shown. The circuit 130 generally
comprises one or more summing circuits 150a-150n, one or more
multiplexer circuits 152a-152n and one or more arithmetic circuits
154a-154n. The arithmetic circuits 154a-154n may be implemented as
color corrector arithmetic circuits. The color corrector circuit
130 is shown receiving a number of input color components, input
offsets, and coefficients. The arithmetic circuits 154a-154n may
be similar for each of the three color component channels. The
signal COLSTEP may be an internally generated signal that may step
through the three color components to control the ordering of
calculations on the three components.

Under the control of the counter signal COLSTEP, the coefficients
and data inputs to the CC arithmetic circuits 154a-154n may be
multiplexed such that all three channels R, G and B (i) receive
the R data and associated coefficients on the first clock, (ii)
receive the G data and associated coefficients on the second
clock, and (iii) receive the B data and associated coefficients on
the third clock. Input offset calculations may be performed prior
to multiplexing. However, it may be beneficial to multiplex the
Ri, Gi, Bi and Rif, Gif, Bif pairs to share a single adder. Such a
configuration may depend on system performance requirements.

The adder 150a may receive the signal Ri and the signal Rif. The
adder 150b may receive the signal Gi and the signal Gif. The adder
150n may receive the signal Bi and the signal Bif. The adders
150a-150n may present outputs to the multiplexer 152a. The
multiplexer 152a may have a select input that may receive the
signal COLSTEP. The multiplexer 152a may present a signal (e.g.,
SINGLECOL) in response to the signal COLSTEP and outputs of the
adders 150a-150n. The signal SINGLECOL may be presented to the
circuits 154a-154n.

The multiplexer 152b may receive the signal Crr, the 
signal Cgr and the signal Cbr. The multiplexer 152b may have a 
select input that may receive the signal COLSTEP. The multiplexer 
152b may present a signal (e.g., RCOEFFS) in response to the 
signals Crr, Cgr, Cbr and COLSTEP. The signal RCOEFFS may be 
presented to the CC Arithmetic circuit 154a. The CC Arithmetic 
circuit 154a may also receive the signal Rof and generate the 
signal Ro. 

The multiplexer 152c may receive the signal Crg, the signal Cgg
and the signal Cbg. The multiplexer 152c may have a select input
that may receive the signal COLSTEP. The multiplexer 152c may
present a signal (e.g., GCOEFFS) in response to the signals Crg,
Cgg, Cbg and COLSTEP. The signal GCOEFFS may be presented to the
CC Arithmetic circuit 154b. The CC Arithmetic circuit 154b may
also receive the signal Gof and generate the signal Go.

The multiplexer 152n may receive the signal Crb, the signal Cgb
and the signal Cbb. The multiplexer 152n may have a select input
that may receive the signal COLSTEP. The multiplexer 152n may
present a signal (e.g., BCOEFFS) in response to the signals Crb,
Cgb, Cbb and COLSTEP. The signal BCOEFFS may be presented to the
CC Arithmetic circuit 154n. The CC Arithmetic circuit 154n may
also receive the signal Bof and generate the signal Bo.

Referring to FIG. 5, a more detailed diagram of the CC arithmetic
circuit 154a is shown. The CC Arithmetic circuits 154b and 154n
may have similar implementations. The circuit 154a may have an
input 160 that may receive the signal Rof, an input 162 that may
receive the signal SINGLECOL, an input 164 that may receive the
signal RCOEFFS, an input 166 that may receive the signal COLSTEP
and an output 168 that may present the signal Ro.

The circuit 154a generally comprises a circuit 170, a circuit 172,
a circuit 174, a circuit 176, a circuit 178 and a circuit 180. The
circuit 170 may be implemented as a multiplication circuit. The
circuits 172 and 174 may be implemented as adder circuits. The
circuit 176 may be implemented as a limit circuit. The circuit 178
may be implemented as a multiplexer circuit. The circuit 180 may
be implemented as a register circuit.

The multiplier 170 may be configured to calculate a single term of
the matrix multiplication on each clock cycle. The outputs may be
added together by storing intermediate sums (e.g., PARTSUM) in the
register 180. After three clock cycles, the value PARTSUM may
contain the full result.

A first input of the multiplexer 178 may receive an output from
the adder circuit 172 (through the register circuit 180). A second
input of the multiplexer 178 may receive a digital "0". A select
input of the multiplexer 178 may receive the signal COLSTEP. The
multiplexer 178 may ensure that a "0" gets added to the first
multiplier result of each group of three multiplies. Such a
configuration may effectively clear the register 180 from one
pixel calculation to the next. The output offset may then be added
to the result with the final value Ro being limited by the limiter
176 to remove illegal values.
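The three-cycle accumulation described above can be simulated with a
short behavioral sketch. Several details are assumptions made for
illustration: the function names, the 0 to 255 output range, and the
reading that the adder 172 forms the running sum while the adder 174
adds the output offset.

```python
def cc_arithmetic(single_col_seq, coeff_seq, out_offset, lo=0, hi=255):
    """Model one CC arithmetic circuit (e.g., 154a) over three clocks.

    single_col_seq: SINGLECOL on each clock (Ri+Rif, Gi+Gif, Bi+Bif).
    coeff_seq: coefficient selected on each clock (e.g., Crr, Cgr, Cbr).
    """
    partsum = None  # register 180; holds stale data when a pixel starts
    for colstep, (data, coeff) in enumerate(zip(single_col_seq, coeff_seq)):
        product = coeff * data                     # multiplier 170
        feedback = 0 if colstep == 0 else partsum  # multiplexer 178
        partsum = product + feedback               # adder 172 into register 180
    result = partsum + out_offset                  # adder 174 (assumed role)
    return max(lo, min(hi, result))                # limiter 176

def color_correct_serial(rgb_in, in_offsets, coeffs, out_offsets):
    """Run the three channels; coeffs[x][y] weights input x in output y."""
    single_col = [c + o for c, o in zip(rgb_in, in_offsets)]  # adders 150a-150n
    return tuple(
        cc_arithmetic(single_col, [coeffs[x][y] for x in range(3)],
                      out_offsets[y])
        for y in range(3)
    )

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
DOUBLE = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]
NEGATE = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]
same = color_correct_serial((10, 20, 30), (0, 0, 0), IDENTITY, (0, 0, 0))
clipped_hi = color_correct_serial((200, 10, 0), (0, 0, 0), DOUBLE, (0, 0, 0))
clipped_lo = color_correct_serial((10, 20, 30), (0, 0, 0), NEGATE, (0, 0, 0))
```

Selecting 0 on the first step means the stale register contents never
enter the sum, so no separate clear of the register is needed between
pixels.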

Referring to FIG. 6, a more detailed diagram of the gamma
corrector circuit 132 is shown. The gamma corrector circuit 132
generally comprises a number of gamma tables 190a-190n. An input R
is modified by the gamma table 190a to present an output R'.
Similarly, an input G is modified by the gamma table 190b to
present an output G'. An input B is modified by the gamma table
190n to present an output B'. The gamma tables 190a-190n may be
implemented as similar gamma look-up tables.

Each of the gamma look-up tables 190a-190n may be implemented for
one of the color components RGB or YUV (although it may not be
necessary to perform gamma correction on YUV). The tables
190a-190n may be implemented as a ROM. However, the tables
190a-190n may be implemented as another appropriate type of memory
element in order to meet the design criteria of a particular
implementation. In either case, the gamma look-up tables may be
configured to convert the data according to the equation EQ2:

EQ2: Output = K * (Input / J)^(1/γ)

where K and J are constants which depend on how the RGB values are
represented numerically in the system, and γ is the constant gamma
correction factor (typically between 2.2 and 2.4). FIG. 7 shows a
typical diagram of a transfer function 200 of the gamma table 132.
However, modifications to the transfer function may be made to
meet the design criteria of a particular implementation.
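A look-up table of this kind can be precomputed as in the following
sketch. The function names, the 8-bit table size, the choice K = J =
255 and the rounding to the nearest integer are assumptions made for
illustration.

```python
def build_gamma_table(gamma=2.2, k=255.0, j=255.0, size=256):
    """Precompute Output = K * (Input / J) ** (1 / gamma) for each input."""
    return [round(k * (i / j) ** (1.0 / gamma)) for i in range(size)]

def gamma_correct(rgb, tables):
    """Look up each component in its own table (e.g., 190a, 190b, 190n)."""
    return tuple(table[c] for c, table in zip(rgb, tables))

table = build_gamma_table()
# The same table is reused for all three components in this example.
corrected = gamma_correct((0, 64, 255), (table, table, table))
```

With these assumed constants, the end points of the range are
unchanged and mid-range values are raised, which is the general shape
of a gamma correction curve.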

While the invention has been particularly shown and 
described with reference to the preferred embodiments thereof, it 
will be understood by those skilled in the art that various changes 
in form and details may be made without departing from the spirit 
and scope of the invention. 